Understanding maximal repetitions in strings
The cornerstone of any algorithm computing all repetitions in a string of
length n in O(n) time is the fact that the number of runs (or maximal
repetitions) is O(n). We give a simple proof of this result. As a consequence
of our approach, the stronger result concerning the linearity of the sum of
exponents of all runs follows easily.
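To make the notion of a run concrete, the following is a minimal brute-force sketch, not the linear-time approach the abstract refers to. It assumes the usual definition: a run is a maximal segment whose length is at least twice its smallest period. The function names `runs` and `sum_of_exponents` are illustrative only.

```python
from fractions import Fraction


def runs(s):
    """Return all runs of s as sorted triples (start, end, period).

    s[start:end] has smallest period `period`, its length is at least
    2 * period, and it cannot be extended left or right with that period.
    Quadratic-time brute force, for illustration only.
    """
    n = len(s)
    smallest = {}  # (start, end) -> smallest period found for that segment
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if s[i] != s[i + p]:
                i += 1
                continue
            start = i
            # Extend the stretch on which letters at distance p match.
            while i + p < n and s[i] == s[i + p]:
                i += 1
            end = i + p  # exclusive end of the p-periodic segment
            if end - start >= 2 * p:
                # Periods are tried in increasing order, so the first
                # period recorded for a segment is its smallest one.
                smallest.setdefault((start, end), p)
    return sorted((a, b, p) for (a, b), p in smallest.items())


def sum_of_exponents(s):
    """Sum of length / period over all runs of s."""
    return sum(Fraction(b - a, p) for a, b, p in runs(s))
```

For example, `runs("aabaab")` reports the two occurrences of `aa` and the whole string with period 3.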
Fewest repetitions in infinite binary words
A square is the concatenation of a nonempty word with itself. A word has
period p if its letters at distance p match. The exponent of a nonempty word is
the quotient of its length over its smallest period.
In this article we prove that there exists an infinite
binary word which contains finitely many squares and simultaneously avoids
words of exponent larger than 7/3. Our infinite word contains 12 squares, which
is the smallest possible number of squares to get the property, and 2 factors
of exponent 7/3. These are the only factors of exponent larger than 2. The
value 7/3 introduces what we call the finite-repetition threshold of the binary
alphabet. We conjecture it is 7/4 for the ternary alphabet, like its repetitive
threshold.
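The definitions of period, exponent and square above can be checked on small finite words with a short sketch such as the one below. It is a brute-force illustration under the stated definitions, not the construction used in the article. The helper names are hypothetical, and the input words are assumed to be nonempty.

```python
from fractions import Fraction


def smallest_period(w):
    """Smallest period of a nonempty word, via its longest border."""
    n = len(w)
    border = [0] * n  # border[i] = length of the longest border of w[:i+1]
    k = 0
    for i in range(1, n):
        while k > 0 and w[i] != w[k]:
            k = border[k - 1]
        if w[i] == w[k]:
            k += 1
        border[i] = k
    return n - border[-1]


def exponent(w):
    """Length of w divided by its smallest period."""
    return Fraction(len(w), smallest_period(w))


def distinct_squares(w):
    """Set of distinct factors of w of the form uu with u nonempty."""
    n = len(w)
    squares = set()
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if w[i:i + half] == w[i + half:i + 2 * half]:
                squares.add(w[i:i + 2 * half])
    return squares


def max_factor_exponent(w):
    """Largest exponent among all nonempty factors of w."""
    n = len(w)
    return max(exponent(w[i:j]) for i in range(n) for j in range(i + 1, n + 1))
```

For instance, the word `alfalfa` has smallest period 3 and therefore exponent 7/3.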
Foreword
Approved by the Governing Committee on 01-12-200
Quasiperiodicities in Fibonacci strings
We consider the problem of finding quasiperiodicities in a Fibonacci string.
A factor u of a string y is a cover of y if every letter of y falls within some
occurrence of u in y. A string v is a seed of y if it is a cover of a
superstring of y. A left seed of a string y is a prefix of y that is a cover
of a superstring of y. Similarly, a right seed of a string y is a suffix of y
that is a cover of a superstring of y. In this paper, we present some
interesting results regarding quasiperiodicities in Fibonacci strings: we
identify all covers, left/right seeds and seeds of a Fibonacci string and all
covers of a circular Fibonacci string.
Comment: In Local Proceedings of "The 38th International Conference on Current
Trends in Theory and Practice of Computer Science" (SOFSEM 2012)
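The definition of a cover can be tried out directly on small Fibonacci strings. The sketch below is a brute-force check, not the paper's characterisation. It assumes the convention f0 = "b", f1 = "a", f(n) = f(n-1) f(n-2), which the abstract does not state, and the function names are illustrative.

```python
def fibonacci_string(n):
    """n-th Fibonacci string, assuming f0 = 'b', f1 = 'a', fn = f(n-1) + f(n-2)."""
    prev, cur = "b", "a"
    for _ in range(n):
        prev, cur = cur, cur + prev
    return prev


def is_cover(u, y):
    """True if every letter of y lies inside some occurrence of u in y."""
    m = len(u)
    if m == 0 or m > len(y):
        return False
    covered_up_to = 0  # y[:covered_up_to] is covered so far
    pos = y.find(u)
    while pos != -1:
        if pos > covered_up_to:
            return False  # a gap is left between consecutive occurrences
        covered_up_to = pos + m
        pos = y.find(u, pos + 1)
    return covered_up_to == len(y)


def covers(y):
    """All covers of y, shortest first. A cover is necessarily a prefix of y."""
    return [y[:k] for k in range(1, len(y) + 1) if is_cover(y[:k], y)]
```

Under this convention `fibonacci_string(5)` is `abaababa`, whose only covers are `aba` and the string itself.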
The Rightmost Equal-Cost Position Problem
LZ77-based compression schemes compress the input text by replacing factors
in the text with an encoded reference to a previous occurrence formed by the
couple (length, offset). For a given factor, the smaller the offset, the
smaller the resulting compression ratio. This is optimally achieved by
using the rightmost occurrence of a factor in the previous text. Given a cost
function, for instance the minimum number of bits used to represent an integer,
we define the Rightmost Equal-Cost Position (REP) problem as the problem of
finding one of the occurrences of a factor whose cost is equal to the cost of
the rightmost one. We present the Multi-Layer Suffix Tree data structure that,
for a text of length n, at any time i, provides REP(LPF) in constant time,
where LPF is the longest previous factor, i.e. the greedy phrase; a reference
to the list of REP({set of prefixes of LPF}) in constant time; and REP(p) in
time O(|p| log log n) for any given pattern p.
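The REP problem itself can be illustrated with a naive scan. This is not the Multi-Layer Suffix Tree, and it comes with none of its time bounds. The sketch assumes that occurrences may overlap position i and that the cost of an offset is its bit length. Both function names are hypothetical.

```python
def longest_previous_factor(text, i):
    """Return (length, offset) of the longest factor starting at i that also
    starts at some j < i, choosing the rightmost such j. (0, 0) if none."""
    best_len, best_pos = 0, -1
    for j in range(i):
        length = 0
        while i + length < len(text) and text[j + length] == text[i + length]:
            length += 1
        # ">=" on ties keeps the later start j, i.e. the rightmost occurrence.
        if length > 0 and length >= best_len:
            best_len, best_pos = length, j
    return (best_len, i - best_pos) if best_len else (0, 0)


def equal_cost_offsets(text, i, length, cost=int.bit_length):
    """Offsets of earlier occurrences of text[i:i+length] whose cost equals
    the cost of the rightmost (smallest-offset) occurrence."""
    factor = text[i:i + length]
    offsets = [i - j for j in range(i) if text.startswith(factor, j)]
    if not offsets:
        return []
    target = cost(min(offsets))
    return [d for d in offsets if cost(d) == target]
```

In the text `aaba` at position 3, the factor `a` occurs at offsets 3 and 2. Both need two bits, so either one is a valid answer to REP under this cost function.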
Identifying all abelian periods of a string in quadratic time and relevant problems
Abelian periodicity of strings has been studied extensively in recent
years. In 2006 Constantinescu and Ilie defined the abelian period of a string,
and several algorithms for the computation of all abelian periods of a string
were given. In contrast to the classical period of a word, its abelian version
is more flexible: factors of the word are considered the same under any
internal permutation of their letters. We show two O(|y|^2) algorithms for the
computation of all abelian periods of a string y. The first one maps each
letter to a suitable number such that each factor of the string can be
identified by the unique sum of the numbers corresponding to its letters and
hence abelian periods can be identified easily. The other one maps each letter
to a prime number such that each factor of the string can be identified by the
unique product of the numbers corresponding to its letters and so abelian
periods can be identified easily. We also define weak abelian periods on
strings and give an O(|y|log(|y|)) algorithm for their computation, together
with some other algorithms for more basic problems.
Comment: Accepted in the "International Journal of Foundations of Computer
Science"
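The prime-product idea mentioned above can be sketched as follows. Each letter is mapped to a distinct prime, so two factors have the same product exactly when they are permutations of each other. This is a simplified illustration and not the paper's algorithm: it only considers decompositions with an empty head, which is an assumption of the sketch, and all names are hypothetical.

```python
import math


def first_primes(k):
    """The first k primes, by trial division (fine for small alphabets)."""
    primes = []
    candidate = 2
    while len(primes) < k:
        if all(candidate % q for q in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def fingerprint(factor, prime_of):
    """Product of the primes assigned to the letters of factor."""
    return math.prod(prime_of[c] for c in factor)


def abelian_periods_empty_head(y):
    """Lengths p such that y splits into blocks of length p that are all
    permutations of the first block, followed by a possibly shorter tail
    whose letters are contained in that block (head assumed empty)."""
    alphabet = sorted(set(y))
    prime_of = dict(zip(alphabet, first_primes(len(alphabet))))
    n = len(y)
    periods = []
    for p in range(1, n + 1):
        ref = fingerprint(y[:p], prime_of)
        ok = True
        for start in range(p, n, p):
            block = y[start:start + p]
            fp = fingerprint(block, prime_of)
            # Full block: same multiset of letters. Tail: sub-multiset,
            # which by unique factorisation means fp divides ref.
            ok = (fp == ref) if len(block) == p else (ref % fp == 0)
            if not ok:
                break
        if ok:
            periods.append(p)
    return periods
```

For `abbaab`, the blocks `ab`, `ba`, `ab` share the fingerprint 6, so 2 is reported as a period of this restricted kind.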
A fast implementation of the Boyer-Moore string matching algorithm
Manuscript, http://www-igm.univ-mlv.fr/~lecroq/articles/cl2008.pd
Finite-Repetition threshold for infinite ternary words
The exponent of a word is the ratio of its length over its smallest period.
The repetitive threshold r(a) of an a-letter alphabet is the smallest rational
number for which there exists an infinite word whose finite factors have
exponent at most r(a). This notion was introduced in 1972 by Dejean, who gave
the exact values of r(a) for every alphabet size a, as was eventually
proved in 2009.
The finite-repetition threshold for an a-letter alphabet refines the above
notion. It is the smallest rational number FRt(a) for which there exists an
infinite word whose finite factors have exponent at most FRt(a) and that
contains a finite number of factors with exponent r(a). It is known from
Shallit (2008) that FRt(2)=7/3.
With each finite-repetition threshold is associated the smallest number of
r(a)-exponent factors that can be found in the corresponding infinite word. It
has been proved by Badkobeh and Crochemore (2010) that this number is 12 for
infinite binary words whose maximal exponent is 7/3.
We show that FRt(3)=r(3)=7/4 and that the bound is achieved with an infinite
word containing only two 7/4-exponent words, the smallest number.
Based on extensive experiments, we conjecture that FRt(4)=r(4)=7/5. The question
remains open for alphabets with more than four letters.
Keywords: combinatorics on words, repetition, repeat, word powers, word
exponent, repetition threshold, pattern avoidability, word morphisms.
Comment: In Proceedings WORDS 2011, arXiv:1108.341